grep -l @soc 2018 /news/*.json | wc -l → 1

@SOC 2018

mentions 1 type Person feed RSS
04:00
2026-06-06
arxiv.org
artificial-intelligence

Agents' Last Exam

Researchers introduced Agents' Last Exam (ALE), a new benchmark designed to evaluate AI agents on long-horizon, economically valuable, real-world tasks with verifiable outcomes. Developed with over 25…
